ABSTRACT

The development of new methods for gene addition 
to mammalian genomes is necessary to overcome 
the limitations of conventional genetic engineering 
strategies. Although a variety of DNA-modifying 
enzymes have been used to directly catalyze the in- 
tegration of plasmid DNA into mammalian genomes, 
there is still an unmet need for enzymes that target a 
single specific chromosomal site. We recently en- 
gineered zinc-finger recombinase (ZFR) fusion pro- 
teins that integrate plasmid DNA into a synthetic 
target site in the human genome with exceptional 
specificity. In this study, we present a two-step 
method for utilizing these enzymes in any cell type 
at randomly-distributed target site locations. The 
piggyBac transposase was used to insert recombin- 
ase target sites throughout the genomes of human 
and mouse cell lines. The ZFR efficiently and specif- 
ically integrated a transfected plasmid into these 
genomic target sites and into multiple transposons 
within a single cell. Plasmid integration was depend- 
ent on recombinase activity and the presence of re- 
combinase target sites. This work demonstrates the 
potential for broad applicability of the ZFR technol- 
ogy in genome engineering, synthetic biology and 
gene therapy. 

INTRODUCTION 

Technologies for introducing gene sequences into mam- 
malian cells are central to numerous applications in medi- 
cine, biopharmaceutical production and mechanistic 
studies of gene function. Similarly, the burgeoning fields 
of synthetic biology and metabolic engineering are 
founded on complex genetic engineering of cell systems. 



Ideally, these applications would involve the addition of 
genes to specific sites in the genome that facilitate desir- 
able gene expression characteristics and minimize aberrant 
effects on the cell. However, current methods for chromo- 
somal gene addition use viral delivery vehicles or 
DNA-modifying enzymes that integrate DNA sequences 
semi-randomly in the billions of base pairs of mammalian 
genomes. This approach has the potential to disrupt
endogenous gene sequences, leading to unpredictable
consequences for cell activity (1). Additionally, isogenic
cell lines must be clonally derived after gene addition to 
ensure robust and uniform levels of gene expression across 
the cell population. Methods for reproducibly integrating 
genes at specific genomic target sites would overcome 
these challenges and enable robust genome manipulation 
for diverse fields of biotechnology and biological research. 

Retroviral and lentiviral vectors are the conventional 
delivery vehicles for gene addition to mammalian genomes. 
These vectors integrate semi-randomly into the genome 
with a preference for promoters or intragenic regions of 
actively transcribed genes (2,3). In several gene therapy 
clinical trials, integration of the strong viral promoters 
near proto-oncogenes has led to gene deregulation and
clonal expansions (4-6). Consequently, these vectors are 
not useful for applications that require targeted gene 
addition. The piggyBac and Sleeping Beauty transposon 
systems have both been used to integrate genes into mam- 
malian genomes in vitro and in vivo (7-9). Because the 
transposon systems do not contain the strong viral pro- 
moters, it is anticipated that activation of nearby onco- 
genes will be unlikely. Additionally, Sleeping Beauty, and 
to a lesser extent piggyBac, appear to integrate more 
randomly into the genome than γ-retroviral vectors and
do not show as strong a preference for integration into 
genes (8,10,11). However, transposition into an oncogene 
or tumor suppressor and subsequent insertional mutagen- 
esis has been demonstrated in genetic screens that are 
designed to favor these events (12,13). Although a few
studies have attempted to direct gene transposition by 
these enzymes to genomic target sites by the fusion of 
sequence-specific DNA-binding proteins, the majority of 
gene addition events in these systems are still random 
(14,15). 

In contrast to the random addition of genes by viral 
vectors and transposases, the Cre and Flp recombinases 
catalyze the exchange of DNA strands between loxP and 
FRT sequences, respectively (16,17). Cre and Flp have 
both been used to target plasmid integration into loxP 
or FRT sites that have been pre-introduced into mamma- 
lian cell genomes (17,18). The efficiency of Flp-mediated 
plasmid integration is comparable to random plasmid in- 
tegration (17,19). Therefore additional selection or screen- 
ing steps are necessary to ensure site-specific integration. 
Cre expression is toxic to mammalian cells (20,21) and can 
lead to chromosomal rearrangements by reacting with 
off-target pseudo-loxP sites present in the human genome 
(22,23). Similarly, the phage-derived integrase phiC31 
catalyzes the integration of a transfected plasmid into 
pseudo-recognition sites in the human genome (24,25). 
Thorough characterization of the pseudo-recognition 
sites in human cells identified over 100 distinct genomic 
integration sites, with a slight preference for integration 
into intragenic regions (26). Chromosomal rearrange- 
ments have also been observed in human cells following 
phiC31 expression (27,28). Although recent in vivo safety 
studies suggest that phiC31 expression does not lead to 
oncogenic insertional mutagenesis (29), there remains a 
clear need for enzymes with strict DNA-binding 
domains that recognize unique sites within mammalian 
genomes. 

The Cys2-His2 zinc-finger domain is the most common 
DNA-binding motif in the human proteome. A single zinc 
finger contains ~30 amino acids and typically functions by 
binding three consecutive base pairs of DNA via inter- 
actions of a single amino acid side chain per base pair 
(30). The specificity of particular zinc fingers for the 64 
possible nucleotide triplets has been examined extensively 
through site-directed mutagenesis, rational design and the 
selection of large combinatorial libraries (31-33). 
The modular structure of the zinc-finger motif permits 
the fusion of several domains in series, allowing for the 
recognition and targeting of extended sequences in mul- 
tiples of 3 nucleotides (34). It is now possible to design 
synthetic zinc-finger proteins to bind practically any target 
site in the human genome (35,36). 
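The arithmetic behind the modular recognition described above can be written out as a small sketch. The function names are illustrative only and do not come from the cited work; the sketch assumes the idealized one-finger-per-triplet model stated in the text.

```python
# Idealized model from the text: each Cys2-His2 zinc finger reads one
# 3-bp triplet, so an array of n fingers reads a contiguous 3*n-bp site.
NUCLEOTIDES = "ACGT"
TRIPLET_LENGTH = 3


def possible_triplets() -> int:
    """Number of distinct nucleotide triplets a single finger could target."""
    return len(NUCLEOTIDES) ** TRIPLET_LENGTH


def site_length(num_fingers: int) -> int:
    """Length in bp of the site read by an array of `num_fingers` fingers."""
    if num_fingers < 1:
        raise ValueError("an array needs at least one finger")
    return TRIPLET_LENGTH * num_fingers


if __name__ == "__main__":
    print(possible_triplets())   # 64 triplets, as stated in the text
    print(site_length(4))        # a four-finger array reads 12 bp
```

Under this model, a four-finger protein such as the one used later in this study would contact 12 bp of DNA.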

These targeted DNA-binding proteins can be fused to 
enzymatic domains to direct enzyme activity to specific 
sites in the genome. This approach has been most prom- 
inently exemplified by the development of zinc-finger nu- 
cleases (ZFNs), in which the synthetic zinc-finger protein 
is fused to the catalytic domain of the FokI restriction
endonuclease (37). When expressed within mammalian 
cells, ZFNs cleave DNA to create a double-strand break 
at a targeted genomic locus (37). This DNA cleavage 
stimulates DNA repair pathways and increases the effi- 
ciency of homologous recombination at the site by several 
orders of magnitude, which otherwise occurs below background
levels of random plasmid integration in human cells. This method
has been used to incorporate gene sequences at specific locations
in the genomes of cells from a
variety of species, including human cell lines and embry- 
onic and adult stem cells (37). However, the potential for 
off-target DNA cleavage, the induction of the DNA- 
damage response pathway and the associated genotoxicity 
that has often been observed with these enzymes remain
concerns for this method (37-39). 

Inspired by the success of the ZFN technology, we have 
recently developed zinc-finger recombinases (ZFRs) to au- 
tonomously perform precise gene addition to the human 
genome without cleaving genomic DNA and activating 
the DNA damage response pathway (40). ZFRs are a 
fusion of a synthetic zinc-finger protein and the catalytic 
domain of a serine recombinase (41,42). For the chromo- 
somal integration of plasmid DNA, the designed zinc- 
finger domain binds to specific target sites in the genome 
and the plasmid, and the recombinase domain catalyzes 
the exchange of DNA strands (40). In the original dem- 
onstration of this approach, a single model recombinase 
target site was introduced into a specific, but unknown, 
chromosomal location in human HEK-293 cells using
the Flp-In™ cell lines and reagents from Invitrogen. We 
demonstrated that ZFRs could target plasmid integration 
into this site with >98% specificity (40). This specific in- 
tegration occurred only if the correct target site was pre- 
sent and the DNA-binding domain of the ZFR contained 
at least three zinc-finger motifs. In the current study, we 
sought to evaluate the general applicability of this 
system by investigating ZFR-mediated plasmid integra- 
tion into target sites in diverse regions of the genome 
and in a variety of cell lines. The piggyBac transposon 
system was used to distribute ZFR recognition sites 
randomly throughout the genome in a population of 
cells. Genomic integration of a transfected plasmid was 
dependent on both an active ZFR and the presence of 
ZFR target sites in the genome, demonstrating the strin- 
gent selectivity of these enzymes. ZFR and piggyBac 
activity were both dependent on cell type. Clonal 
analysis showed that the vast majority of stably trans- 
fected cells contained integration events at the intended 
target site, and in some instances the ZFR was able to 
integrate plasmids into multiple transposon target sites 
within the same cell. These results support the broad 
utility of the ZFR technology and suggest possibilities 
for engineering of isogenic cell lines and stem cell-based 
gene therapies. 

MATERIALS AND METHODS 

Plasmids 

The plasmids pTpB and pCMV-pB were provided by 
Matthew H. Wilson (8). The pTpB plasmid contains the 
piggyBac transposon, which carries a kan R /neo R cassette 
driven by the SV40 promoter and a p15A origin of replication
for propagation in Escherichia coli. pCMV-pB
carries the expression cassette for the piggyBac trans- 
posase. For our piggyBac expression vector, the 
piggyBac transposase was PCR amplified and inserted 
into a pcDNA3.1-Zeocin ZFR expression plasmid (40) 
in which the ZFR had been removed by SfiI digestion. The
44-bp ZFR target site was added to pTpB by PCR amp- 
lification of a 1109-bp fragment containing the C.20G 
ZFR target sequence and a neighboring promoterless 
EGFP transgene from the pcDNA5/FRT-EGFP-C.20G 
plasmid (40) and ligation into the BamHI site in pTpB 
downstream of the kan R /neo R cassette in the piggyBac 
transposon. The luciferase-encoding piggyBac transposon 
was created by removing the kan R /neo R cassette in pTpB 
by digestion with BglII and SacII and ligation of a CMV
promoter driving the luciferase gene into these sites. The 
ZFR expression plasmid pcDNA3.1-Zeocin-GinC4 and 
the C.20G-Puro ZFR donor plasmid carrying the puro R 
cassette have been described previously (40). The 
luciferase-encoding ZFR donor plasmid was created by 
removing the puro R gene from the C.20G-Puro donor 
plasmid (40) by digestion with HindIII and NheI and ligating
the luciferase gene into these sites, under the control of
the SV40 promoter. All vector sequences were confirmed 
by DNA sequencing. 



Transposition and plasmid integration 

All cell lines were cultured in DMEM, 10% fetal bovine 
serum, 100 U/ml penicillin G sodium and 100 µg/ml
streptomycin sulfate in a humidified 5% CO2 atmosphere
at 37°C. All cell culture media components and reagents
were obtained from Invitrogen unless otherwise noted. 
Transfections were performed with Lipofectamine 2000 
(Invitrogen) according to the manufacturer's instructions. 
Twenty-four hours prior to transfection, 100 000 cells 
were plated into 24-well plates. For piggyBac-mediated
transposition, cells were transfected with 250 ng of pTpB 
and 250 ng of pcDNA3.1-Zeocin-piggyBac or control 
pcDNA3.1-Zeocin plasmid (no insert). For ZFR- 
mediated donor plasmid integration, cells were transfected 
with 50 ng donor plasmid and 500 ng of control 
pcDNA3.1-Zeocin plasmid (no insert), pcDNA3.1- 
Zeocin-ZFR S9A , or pcDNA3.1-Zeocin-ZFR. 

For colony-counting assays, 10% of the transfected cell 
population was moved to one well of a six-well plate at 
3 days post-transfection. The following day, cell culture 
media was exchanged with media containing 800 µg/ml
G418 sulfate and/or 2 µg/ml puromycin as appropriate.
Approximately 14 days later, cells were stained with 
crystal violet solution and colony number was determined 
by automated counting using a GelDoc XR imaging sys- 
tem with Quantity One 1-D analysis software (Bio-Rad). 
Reported colony numbers have been multiplied by a 
factor of ten to account for the initial 1:10 split of the cell population.
For luciferase assays, cells were continuously cultured in 
the absence of selection with passaging every 3-5 days. At 
each passaging, cell samples were harvested and frozen. At 
the conclusion of the experiment, samples were thawed 
and assayed for luciferase activity with the Luciferase 
Reporter Assay System (Promega) according to the manufacturer's
instructions using a Veritas Microplate
Luminometer (Turner Biosystems) equipped with 
injectors. 
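The colony-count scaling described in this section can be sketched as follows. The function names and the example counts are hypothetical and are not data from this study; only the 10% replating fraction and the resulting factor of ten come from the text.

```python
# Only 10% of each transfected population was replated for selection,
# so counted colonies are scaled by ten to estimate the whole population.
REPLATED_FRACTION = 0.10


def reported_colonies(counted: int, replated_fraction: float = REPLATED_FRACTION) -> int:
    """Scale a raw colony count back to the full transfected population."""
    if not 0 < replated_fraction <= 1:
        raise ValueError("replated_fraction must be in (0, 1]")
    return round(counted / replated_fraction)


def fold_increase(treated: float, control: float) -> float:
    """Fold change of treated over control colony numbers."""
    if control <= 0:
        raise ValueError("control count must be positive")
    return treated / control


if __name__ == "__main__":
    # Hypothetical raw counts, for illustration only.
    control = reported_colonies(3)
    treated = reported_colonies(45)
    print(control, treated, fold_increase(treated, control))
```

Because both arms are scaled by the same factor, the fold increase is unaffected by the correction; only the absolute colony numbers change.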



Genomic DNA isolation and analysis 

Genomic DNA was purified with QIAamp DNA Blood 
kits (Qiagen) from G418 R /puro R polyclonal cell popula- 
tions or monoclonal populations derived by limiting dilu- 
tion. Genomic DNA was used as template in PCR reactions 
with primer combinations that amplified the unmodified 
ZFR target site (pTpB-prim1 & pTpB-prim2), the integration
of the donor plasmid into the ZFR target site in the
forward orientation (donor-prim1 & pTpB-prim2), or the
integration of the donor plasmid into the ZFR target site
in the reverse orientation (donor-prim1 & pTpB-prim1).
Primer binding sites are indicated in Figure 1A and C.
Primer sequences are: pTpB-prim1: 5'-TTTTCCGGGACGCCGGCTGG-3';
pTpB-prim2: 5'-CTTGTCGGCCATGATATAGACG-3';
donor-prim1: 5'-TGACGTCAATGACGGTAAATGG-3'.

Southern blots were performed using standard procedures.
Briefly, 1 µg of genomic DNA was digested with
HindIII (Figure 1A and C) for 4 h. Digested DNA was
separated on a 1% agarose gel by electrophoresis. The gel 
was washed and blotted onto a membrane (GeneScreen 
Plus, Perkin Elmer) overnight using an upward capillary 
transfer. The membrane was crosslinked with UV light for 
30 s and dried on filter paper. An 800-bp probe of the neo R 
gene in the pTpB plasmid was generated by PCR with the 
primers 5'-ATGATTGAACAAGATGGATTGC-3' and
5'-TCGTCAAGAAGGCGATAGAA-3' and labeled with
32P using the Prime-It II kit (Stratagene) according to
the manufacturer's instructions. The blot was pre-hybridized
in MiracleHyb buffer (Stratagene). Following overnight 
hybridization with the radiolabeled probe at 65°C in a
rotary hybridization oven, blots were washed twice in
solution A (2× SSC, 0.1% SDS) and twice in solution B
(0.01× SSC, 0.1% SDS) for 15 min at 50°C. Labeled blots
were then exposed to a phosphorimager cassette 
(Molecular Dynamics) for 14 days and visualized on a 
Storm840 phosphorimager (GE Healthcare). 

Plasmid rescue 

Genomic DNA (10 µg) was digested with NheI and XbaI,
which have compatible 5' CTAG overhangs. NheI cut
inside the donor plasmid, and NheI or XbaI were expected
to cut within the neighboring genomic DNA outside of the
5'-end of the piggyBac transposon. Digested DNA was
purified by ethanol precipitation and ligated in dilute
conditions of 300 µl total ligation volume to favor self-ligation.
The resulting plasmid contained the kan R gene
and p15A origin of replication from the pTpB transposon.
The DNA was concentrated by ethanol precipitation, re- 
suspended in water and electroporated into XL1 Blue 
E. coli and plated onto kanamycin-containing LB agar 
plates. Colonies were screened by colony PCR for plasmids 
that contained the donor plasmid sequence integrated into 
the piggyBac transposon (~10% of all colonies), rather 
than only the transposon. Plasmid DNA from positive 
colonies was purified by miniprep (Qiagen) and sequenced 
with primers extending from both the 5' terminal inverted 
repeat (TR) of the transposon and the puro R gene within 
the donor plasmid (Figure 6). 
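The end compatibility that the plasmid rescue relies on can be checked with a short sketch. The recognition sequences and cut positions (NheI, G^CTAGC; XbaI, T^CTAGA) are standard enzyme properties rather than details given in this paper, and the helper names are illustrative.

```python
# Recognition sites (top strand, 5'->3') and the number of bases to the
# left of the top-strand cut. Both sites are palindromic.
ENZYMES = {
    "NheI": ("GCTAGC", 1),  # G^CTAGC
    "XbaI": ("TCTAGA", 1),  # T^CTAGA
}


def five_prime_overhang(site: str, cut: int) -> str:
    """Return the 5' overhang left by a palindromic 5'-overhang cutter.

    The bottom strand is cut at the mirror-image position, so the
    single-stranded overhang is the sequence between the two cuts.
    """
    if not 0 < cut < len(site) / 2:
        raise ValueError("this sketch only handles 5'-overhang cutters")
    return site[cut:len(site) - cut]


def compatible(enzyme_a: str, enzyme_b: str) -> bool:
    """True if two enzymes leave identical overhangs.

    Identical overhangs anneal to each other here because the overhang
    (CTAG) is its own reverse complement.
    """
    return five_prime_overhang(*ENZYMES[enzyme_a]) == five_prime_overhang(*ENZYMES[enzyme_b])


if __name__ == "__main__":
    for name, (site, cut) in ENZYMES.items():
        print(name, five_prime_overhang(site, cut))
    print(compatible("NheI", "XbaI"))
```

Because the overhangs match, a fragment with one NheI end and one XbaI end can circularize by self-ligation just as a fragment with two NheI ends can.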
Figure 1. (A) Schematic of plasmids used for piggyBac-mediated transposition. (B) Schematic of plasmids used for ZFR-mediated integration of the
donor plasmid. (C) Schematic of predicted results of targeted donor plasmid integration into the piggyBac transposon in forward or reverse
orientations. Small black arrows indicate the positions of PCR primers used for detection of targeted integration. HindIII restriction sites and
fragment sizes indicate expected results of Southern blot analysis with a probe against kan R /neo R . (D) Experimental design. Cell lines were first
co-transfected with a plasmid encoding the piggyBac transposase (pCMV-pB) and a plasmid carrying a piggyBac transposon that contains a ZFR
target site and neo R /kan R cassette (pTpB). Random transposition was used to generate a polyclonal cell population with ZFR target sites distributed
throughout the genome. Following selection of neo R cells, these populations were co-transfected with a ZFR expression plasmid (pCMV-ZFR) and a
donor plasmid containing a ZFR recognition site and puro R cassette to assess targeted plasmid integration into the piggyBac transposon by the ZFR.



Figure 2. PiggyBac-mediated gene transposition into the genome of human and mouse cell lines. Cells were co-transfected with the transposon
vector and either empty expression plasmid or transposase expression plasmid. (A) Stable chromosomal integration of a piggyBac transposon
carrying a neo R cassette was measured as the number of G418 R cell colonies. Fold increase in colony number is indicated in each graph (n = 3,
mean ± SEM; HEK293 P = 0.0002; NIH3T3 P = 0.01; HeLa P = 0.0001; HuH-7 P = 0.001). (B) Stable chromosomal integration of a piggyBac
transposon carrying a luciferase gene was measured as luciferase activity [relative luminescence units (RLU)] in an unselected cell population (n = 3,
mean ± SEM; HEK293 P = 5E-19, NIH3T3 P = 1E-9, HeLa P = 1E-14, HuH-7 P = 6E-8).
Figure 3. ZFR-mediated plasmid integration into the genome of human and mouse cell lines. Unmodified parental cell lines or G418 R
transposon-modified polyclonal cell populations were co-transfected with a donor plasmid and control (empty), ZFR S9A , or ZFR expression
plasmid. (A) Stable chromosomal integration of a donor plasmid carrying a puro R cassette was measured as the number of puro R cell colonies
(n = 3, mean ± SEM). P-values for each cell type with respect to expression vector and unmodified vs. transposon modified cells are: HEK293
P = 1E-8, P = 4E-5; NIH3T3 P = 2E-8, P = 7E-5; HeLa P = 9E-5, P = 9E-3; HuH-7 P = 6E-4, P = 0.03. (B) Stable chromosomal integration of a
donor plasmid carrying a luciferase gene was measured as luciferase activity in an unselected cell population (n = 3, mean ± SEM; P = 4E-19).

Statistics

Data are presented from representative experiments as the
mean of triplicate samples ± standard error of the mean
(mean ± SEM). Statistical analyses for colony number
assays included two-sided, two-sample Student's t-test
assuming equal variances (Figure 2A) or two-way
ANOVA accounting for both cell type and treatment
(Figure 3A). Statistical analyses for luciferase assays
included two-way ANOVA accounting for both time
and treatment, for which only P-values with respect to
treatment were reported (Figures 2B and 3B). In order to
make the variance independent of the mean, statistical
analysis of luciferase assays was performed following
logarithmic transformation of the raw data. Analyses were
performed with Microsoft Office Excel 2007 with
α = 0.05.
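The summary statistics named in this section can be illustrated with a minimal sketch. It is not the authors' Excel workflow: the function names and the triplicate values are hypothetical, and the base of the logarithm is assumed to be 10 because the text does not state it. The P-value step, which requires the t-distribution CDF, is omitted.

```python
import math
from statistics import mean, stdev, variance


def sem(values):
    """Standard error of the mean: sample SD divided by sqrt(n)."""
    return stdev(values) / math.sqrt(len(values))


def pooled_t_statistic(a, b):
    """Two-sample Student's t statistic assuming equal variances.

    Returns (t, degrees_of_freedom).
    """
    na, nb = len(a), len(b)
    df = na + nb - 2
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / df
    standard_error = math.sqrt(pooled_var * (1 / na + 1 / nb))
    return (mean(a) - mean(b)) / standard_error, df


def log_transform(values):
    """Log10-transform raw luminescence values (base 10 is an assumption)."""
    return [math.log10(v) for v in values]


if __name__ == "__main__":
    # Hypothetical triplicate colony counts, for illustration only.
    control = [10, 20, 30]
    treated = [400, 500, 600]
    print(mean(treated), sem(treated))
    print(pooled_t_statistic(treated, control))
    print(log_transform([1e3, 1e5, 1e7]))
```

The log transform is applied before the analysis because luminescence readings spanning several orders of magnitude tend to have variances that grow with the mean.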



RESULTS 

In order to create a heterogeneous population of poly- 
clonal cells carrying ZFR target sites, the piggyBac trans- 
poson system was used to randomly distribute target sites 
throughout the genomes of several cell lines (Figure 1). 
The C.20G ZFR target site (40) was added to the piggyBac
transposon in the pTpB plasmid, which also contains a
kanamycin/neomycin resistance (kan R /neo R ) cassette and
a p15A origin of replication for propagation in E. coli (8).
This modified transposon was co-transfected with control 
plasmid DNA or a plasmid encoding the piggyBac 
transposase into four cell lines: human embryonic kidney 
(HEK293) cells, human epithelial cervical cancer (HeLa) 
cells, human hepatocarcinoma (HuH-7) cells and NIH3T3 
mouse embryonic fibroblasts. Cells in which the trans- 
poson was stably integrated into the genome were neomy- 
cin resistant and selected by growth in the presence of 
antibiotic G418. The number of G418-resistant (G418 R ) 
cell colonies was used as a measure of transposase activity 
(Figure 2A). The presence of the transposase significantly 
increased the number of stable integrants, ranging from 
9-fold in HuH-7 cells to 58-fold in HEK293 cells. The level 
of background integration of plasmid DNA, the 
fold-increase of G418 R colonies upon the addition of 
transposase, and the total number of G418 R colonies 
varied greatly among cell lines, as noted previously (43). 

To monitor stable transposition in the absence of select- 
ive pressure, the neo R gene in the piggyBac transposon 
was replaced with a luciferase gene. This plasmid was 
co-transfected into each cell line with control plasmid 
DNA or the transposase expression plasmid and samples 
were harvested over a time course of 40 days to monitor 
stable luciferase gene expression (Figure 2B). In the 
absence of transposase, luciferase levels dropped precipi- 
tously over 10-15 days post-transfection as a result of epi- 
somal plasmid degradation and dilution during cell 
division. Luciferase activity stabilized at a low level after 
this period. In contrast, addition of the transposase re- 
sulted in sustained gene expression at significantly higher 
levels for the duration of the experiment in all cell lines. 

PiggyBac transposes semi-randomly into mammalian 
genomes at TTAA sequences (8). Consequently the 
G418 R cell populations from the samples transfected 
with both transposase and transposon were heteroge- 
neous; each original cell incorporated one or more trans- 
posons into different locations in its genome (44). 
Therefore these polyclonal cell populations were used to 
assess the ability of the ZFR to integrate plasmid DNA 
into different genomic loci in various cell types (Figure 1). 
The ZFR used for this study, GinC4, is a fusion of the 
catalytic domain of the Gin invertase and a four-finger 
designed zinc-finger protein (40). For this experiment, 
both the parental cell lines and the G418 R transposon- 
modified cell lines (Figure 2A) were co-transfected with 
a donor plasmid containing both a ZFR target site and 
a puromycin-resistance (puro R ) cassette and either control 
DNA, a plasmid that expresses a catalytically inactive 
ZFR (ZFR S9A ), or a plasmid that expresses the ZFR. At 
3 days post-transfection, the cells were moved to 
puromycin-containing media for 14 days and the level of 
plasmid integration into the genome was measured as 
the number of puro R cell colonies (Figure 3A). The 
addition of active ZFR to the transposon-modified cells 
increased the level of plasmid integration relative to 
cells without ZFR, cells with inactive ZFR and cells 
with ZFR but without transposon target sites. Similar to
piggyBac-mediated transposition, the level of background
plasmid integration, the fold-increase of puro R colonies 
upon the addition of ZFR, and the total number of 
puro R colonies varied greatly among cell lines. The differences
in plasmid integration levels between the parental
cell lines and the transposon-modified cell lines when
both were treated with donor plasmid and active ZFR
were particularly notable in the NIH3T3, HeLa and
HuH-7 cell lines. The selectivity of the ZFR was such 
that in the absence of the appropriate 44-bp target site, 
the level of ZFR-mediated recombination into any of the 
>3 billion base pairs of the human or mouse genomes was 
not significantly higher than background plasmid integra- 
tion. It is unclear why there was a measurable increase in 
plasmid integration in the HEK293 cell line upon the 
addition of active ZFR despite the absence of transposon 
target sites, although there was a significant increase when 
the transposon was present. It is also noteworthy that 
binding of the catalytically inactive ZFR S9A to the target 
sites with intact zinc-finger proteins led to a moderate 
increase in colony numbers in HEK293 cells but did not 
have any effect on plasmid integration in the other three 
cell types. Importantly, we have previously validated that 
the ZFR S9A mutant is catalytically inactive in a high- 
throughput excision assay in E. coli (45). Therefore the 
mechanism for ZFR S9A -mediated stable transfection in 
HEK293 cells is unknown, although it is clear that none 
of the plasmid was correctly targeted to the transposon 
(Figure 4). It is possible that binding of the ZFR to its 
target site on the plasmid may modify its stability in the 
absence of any catalytic activity. 

In order to measure the time course of gene expression 
following ZFR-mediated plasmid integration in the ab- 
sence of the selective pressure of antibiotics, the puro R 
gene in the donor plasmid was replaced with the luciferase 
gene. G418 R HEK293 cells were co-transfected with the 
luciferase-encoding donor plasmid and either control 
DNA, the ZFR S9A expression plasmid, or the active
ZFR expression plasmid. Cell samples were collected at 
various time points up to 30 days post-transfection and 
assayed for luciferase activity (Figure 3B). Cells treated 
with donor plasmid and ZFR showed > 10-fold increases 
in the levels of stable gene expression relative to cells 
treated with donor plasmid only. The expression of the 
inactive ZFR S9A led to an intermediate level of stable 
luciferase expression in HEK293 cells similar to the 
colony formation assay (Figure 3A). 

In order to determine whether the plasmid integration 
events mediated by the ZFR were correctly targeted to the 
transposon, genomic DNA was purified from the polyclonal
puro R cell populations and assayed by PCR
(Figure 4). PCR primers were used that amplify the un- 
modified transposon or the ZFR donor integrated into the 
transposon in either forward or reverse orientations 
(Figure 1A and C) (40). Although these cell populations 
were polyclonal with respect to transposon integration 
sites, the PCR amplification should occur independently 
of chromosomal location. The PCR products correspond- 
ing to the correct plasmid integration events in forward 
and reverse orientations were present only in samples con- 
taining both the donor and an active ZFR. Unmodified 
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Figure 4. Specificity of ZFR-mediated plasmid integration. Genomic 
DNA from polyclonal G418 R /puro R cell populations that had been 
transfected with donor plasmid and control (empty), ZFR S9A , or 
ZFR expression plasmid was analyzed by PCR. PCR primer combin- 
ations amplified either the unmodified ZFR target site on the piggy Bac 
transposon or the integration of the donor plasmid at this site in either 
forward or reverse orientations. Targeted plasmid integration was 
detected only in samples treated with the ZFR. 



transposons were detected in all cell populations, which is 
not surprising given that it is possible for >15 transpos- 
ition events to occur in a single cell (44) and cells would be 
selected in this assay if the plasmid integrated into at least 
one of these transposons. Although most samples from 
ZFR-treated HeLa cells were positive for site-specific 
plasmid integration, results were inconsistent due to low 
numbers of puro R colonies (Figure 3A) and overgrowth of 
the cell population by small numbers of rapidly expanding 
colonies. Therefore results obtained from experiments 
with HeLa cells are not shown. 

Although the genomic PCR of the polyclonal popula- 
tion indicated that site-specific integration events were 
present within the sample, how frequently these events 
occurred was unknown. Therefore monoclonal cell popu- 
lations were derived from single cells of the puro R cultures 
that had been treated with donor and active ZFR to de- 
termine the overall level of specificity in the population 
(Figure 5). The genomic DNA from these clonal popula- 
tions was analyzed by PCR for targeted integration events 
(Figure 5A). Except for one aberrant HEK293 clone, all of 
the clonal samples contained transposons that had not 
been modified by plasmid integration, which could be pre- 
dicted by previous studies showing multiple transpositions 
per cell (44). Of 24 HEK293 clones, only five clones had
no targeted integration events, including the single clone
in which the unmodified transposon was not detectable. 
These clones represent cells in which the puro R donor plas- 
mid integrated at an off-target site. Thirteen clones had 
only forward integrations, four clones had only reverse 
integrations, and two clones contained both reverse and 
forward integrations. All 20 of the HuH-7 clones and 19 of 
the 20 NIH3T3 clones contained targeted integration 
events, and the vast majority of these (17/20 NIH3T3 
and 17/20 HuH-7 clones) contained both forward and 
reverse integrations. The PCR products from selected 
samples were sequenced to confirm the accurate junction 
of the transposon and donor plasmid. Southern blots of 
the genomic DNA from a subset of the HEK293 clones 
verified the PCR results and validated the PCR-based 
method for detecting clones with forward and/or reverse 
integration orientations (Figure 5B). 
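The clone tallies reported above can be cross-checked with a short sketch. The HEK293 counts are taken from the text; the category labels and function names are illustrative and are not part of the original analysis.

```python
from fractions import Fraction


def classify_clone(forward_band: bool, reverse_band: bool) -> str:
    """Label a clone from the presence of the two junction PCR products."""
    if forward_band and reverse_band:
        return "both"
    if forward_band:
        return "forward_only"
    if reverse_band:
        return "reverse_only"
    return "none"  # donor presumably integrated at an off-target site


# HEK293 clone counts as reported in the text (24 clones in total).
HEK293_CLONES = {"none": 5, "forward_only": 13, "reverse_only": 4, "both": 2}


def targeted_fraction(counts: dict) -> Fraction:
    """Fraction of clones carrying at least one targeted integration."""
    total = sum(counts.values())
    return Fraction(total - counts["none"], total)


if __name__ == "__main__":
    print(sum(HEK293_CLONES.values()))          # total clones analyzed
    print(targeted_fraction(HEK293_CLONES))     # targeted / total
    print(float(targeted_fraction(HEK293_CLONES)))
```

The HEK293 categories sum to 24 clones, of which 19 carry at least one targeted integration, consistent with the 19/24 figure cited in the Discussion.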

The chromosomal locations of representative ZFR- 
mediated integration events were determined to confirm 
that donor plasmid integration occurred at multiple 
chromosomal locations (Figure 6). A plasmid rescue assay 
was performed in which the genomic DNA of the poly- 
clonal puro R /G418 R ZFR-treated HEK293 cells was 
digested with the restriction enzymes NheI, which cuts
inside the donor plasmid, and XbaI, which is expected to
cut in the chromosomal DNA near the transposon inte- 
gration site. Self-ligation of the digestion products resulted 
in a new plasmid that contained sequences from the 
piggyBac transposon, donor plasmid and flanking chromosomal
DNA. This plasmid can be transformed and propagated
in E. coli because of the kan R cassette and p15A
origin of replication within the transposon. Plasmids con- 
taining the donor fragment were distinguished from 
non-targeted transposons by colony PCR and sequenced 
with transposon- and donor-specific primers to determine 
chromosomal location. Representative sequences show 
that the piggyBac target sites were on different chromo- 
somes (Figure 6). The recovered events occurred within 
genes, which is a preferred region for piggyBac transpos- 
ition (8). These results show that the ZFR was active at 
several chromosomal locations in human cells. 



DISCUSSION 

This study demonstrates the ability of ZFRs to direct the 
targeted chromosomal integration of a plasmid at specific 
recognition sites distributed throughout human and 
mouse genomes in several cell types. As measured by the 
fraction of stable transfectants containing successfully 
targeted integration events (Figure 5A), ZFR specificity 
is significantly greater than that of any of the enzyme- or 
viral-mediated vector integration strategies described to 
date. In our previous study using an isogenic HEK293 
cell line with a ZFR target site at a single accessible 
locus, we observed targeted integration in 55 of 56 
clones (40). We have now observed similarly high levels 
of specificity into polyclonal targets in 19/24, 19/20 and 
20/20 clones for HEK293, NIH3T3 and HuH-7 cells, re- 
spectively (Figure 5A), as well as detectable site-specific 
integration in HeLa cells (Figure 4). For comparison, a 
previous study which inserted the 39-bp target site of the 
phiC31 integrase into the genome of HEK293 cells showed 
only 14/96 clones contained site-specific integration events 
(24), with the other 82 clones likely containing integration 
events at any of the other >100 genomic sites that are also 
recognized by this enzyme (26). 

Figure 5. Clonal analysis of ZFR-mediated plasmid integration events. Single cell colonies were isolated by limiting dilution from polyclonal G418R/puroR cell populations that had been transfected with donor plasmid and ZFR expression plasmid. (A) Genomic PCR was used to detect the unmodified ZFR target site on the piggyBac transposon and/or the integration of the donor plasmid at this site in either forward or reverse orientations. (B) Southern blot of unmodified HEK293 cells, G418R HEK293 cells containing the piggyBac transposon, and the first six G418R/puroR clonal HEK293 cell populations was performed with a probe against the neoR cassette. Distinct bands correspond to the unmodified transposon and forward and reverse orientations of the correctly targeted donor plasmid integration (Figure 1A and C) and coincide with the PCR results in (A). 

Figure 6. Mapping the chromosomal locations of donor plasmid integrations into the transposon target site. Genomic DNA of the polyclonal G418R/puroR HEK293 cell populations was digested with NheI and XbaI, circularized by ligation and transformed into E. coli. Purified plasmid was 
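
To make the comparison of clone counts quoted in the text above easier to read, the short Python sketch below converts each ratio into a percentage. The counts are those stated in the text. The dictionary labels and the function name are only illustrative, and no statistical test is implied.

```python
# Clone counts as quoted in the text: (clones with targeted integration, clones screened).
clone_counts = {
    "ZFR, isogenic HEK293 (ref. 40)": (55, 56),
    "ZFR, HEK293": (19, 24),
    "ZFR, NIH3T3": (19, 20),
    "ZFR, HuH-7": (20, 20),
    "phiC31 integrase, HEK293 (ref. 24)": (14, 96),
}


def targeted_fraction(targeted, screened):
    """Fraction of screened clones that carry a targeted integration."""
    return targeted / screened


for label, (targeted, screened) in clone_counts.items():
    fraction = targeted_fraction(targeted, screened)
    print(f"{label}: {targeted}/{screened} = {fraction:.1%}")
```

On these numbers the ZFR fractions range from roughly 79% to 100%, compared with roughly 15% for the phiC31 example.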

The high frequency of targeted integration events in the 
clonal cell populations (Figure 5) does not eliminate 
the possibility that off-target integration events also 
occurred in these cells. However, given that only ~1% 
of the total transfected cell population contains an inte- 
gration event (40), we expect it is unlikely that high levels 
of ZFR-mediated off-target integration are occurring. 
Furthermore, ZFR activity increased ~3-fold when the 
transposon target site was pre-introduced in the genome 
of NIH3T3, HeLa and HuH-7 cells (Figure 3A). Plasmid 
integration by the ZFR in the absence of transposon 
target sites was only marginally higher than background 
plasmid integration in these cells (Figure 3A), suggesting 
that many off-target integration events in our experiments 
were the result of random plasmid integration that was 
not ZFR-mediated. Nevertheless, additional studies dedi- 
cated to thoroughly and quantitatively characterizing 
potential off-target recombination and ZFR-DNA inter- 
actions are necessary to understand the full potential of 
this technology and its value relative to methods that 
enhance gene targeting by homologous recombination 
(37,46,47). 

Commercial systems are available for cell line engineering 
by targeted plasmid integration. For example, the 
Flp-In™ and Jump-In™ systems marketed by Invitrogen 
make use of the Flp recombinase and phiC31 integrase, 
respectively. As described above, Flp activity is not sig- 
nificantly greater than background levels of plasmid inte- 
gration and phiC31 does not have the intrinsic sequence 
specificity to recognize a single site in mammalian genomes. 
Therefore, these commercial systems compensate for 
this lack of activity and specificity by dividing a single 
expression cassette for an antibiotic resistance gene 
between the genomic target site and the donor plasmid 
such that only plasmid integration at the target site will 
reconstitute this cassette (48,49). In contrast, we readily 
recovered cells with targeted integration events while selecting 
only for plasmid incorporation anywhere in the genome, 
because the complete puroR cassette was carried on the donor 
plasmid. The ability to target multiple sites within the 
same cell is further evidence of the high level of ZFR ef- 
ficiency and specificity. 
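
The difference between the two selection schemes discussed above can be sketched as a small Python model. This is a deliberately simplified illustration: the event mix is hypothetical and is not data from this study, and the function names are invented for the example. A split cassette yields resistance only when integration reconstitutes it at the target site, whereas a complete cassette yields resistance after any stable integration.

```python
def survives_selection(site, cassette):
    """Return True if an integration event gives a drug-resistant cell.

    site: "target" or "random"; cassette: "split" or "complete".
    """
    if cassette == "complete":
        return True           # any stable integration expresses the marker
    return site == "target"   # split marker is reconstituted only at the target


def targeted_fraction_after_selection(events, cassette):
    """Fraction of surviving events that are at the target site."""
    survivors = [site for site in events if survives_selection(site, cassette)]
    return sum(site == "target" for site in survivors) / len(survivors)


# Hypothetical mix of integration events, not data from the study.
events = ["target"] * 8 + ["random"] * 2

print(targeted_fraction_after_selection(events, "complete"))  # 0.8
print(targeted_fraction_after_selection(events, "split"))     # 1.0
```

With a complete cassette, the fraction of targeted clones among survivors directly reflects the specificity of the enzyme, which is why a high targeted fraction under this less stringent selection is informative.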

The two-step strategy for genome engineering described 
here (Figure 1D) will be directly useful for a variety of 
applications in stem cell-based gene therapy, cell line en- 
gineering, genetic engineering for biopharmaceutical pro- 
duction and animal transgenesis. For example, the 
polyclonal population of transposon-modified cells can 
be screened for individual clones in which the ZFR 
target site has been integrated into a favorable genomic 
locus, as was recently demonstrated in the creation of 
induced pluripotent stem cells with the piggyBac trans- 
poson system (50) and the screening of lentiviral safe 
harbor integration sites (51). The favorable locus may be 
defined by the degree of interactions with endogenous 
genes, the distance to neighboring genes, lack of silencing 
effects, high levels of gene expression, or other factors 
(51). The ZFR could then be used to integrate transgenes 
specifically at this locus in the clonal transposon-modified 
population. This two-step approach is advantageous 
because many different transgenes can be added to the 
transposon-modified cell source in parallel, all of which 
would be targeted to the same favorable locus. An analo- 
gous method has been proposed for using piggyBac and 
Cre recombinase for the genomic analysis of transgenic 
mice (52). A similar approach has also been used with 
phage integrases to engineer the genomes of cell lines, embryonic 
stem cells and whole organisms (49,53). However, 
all of these systems require additional levels of selection to 
compensate for insufficient enzyme specificity, as dis- 
cussed above. Further investigation of ZFR activity in 
primary cells, embryonic stem cells, induced pluripotent 
stem cells and animals will elucidate the advantages and 
limitations of ZFRs in these settings. 

Several other studies have investigated targeted gene 
addition by fusing engineered zinc-finger proteins or 
other targeted DNA-binding proteins to transposases 
(14,15) and retroviral integrases (54,55). Although these 
strategies have been successful in directing integration 
and transposition, the vast majority of integration events 
occur at locations other than the target locus. 
Importantly, these fusion proteins were not engineered 
to abrogate the integration activity of the parent enzyme. 
Consequently, any rare targeted integration events occur 
amidst many more semi-random events. This approach to 
protein engineering is in contrast to our development 
of ZFRs, in which the serine recombinase catalytic 
domain is only active when fused to a DNA-binding 
protein (40). We expect that this modular structure and 
function of ZFRs contributes substantially to their precise 
specificity. 

Ultimately, we envision the design of ZFRs that specif- 
ically target endogenous genome sequences, enabling the 
direct integration of plasmid DNA into any natural locus. 
The ZFR used in this study is composed of a fusion 
between a designed four-finger zinc-finger protein and a 
hyperactive catalytic domain of the Gin invertase from 
bacteriophage Mu (40). Synthetic zinc-finger proteins can 
be engineered to target almost any DNA sequence (35,36). 
However, the Gin catalytic domain retains a strict 
sequence specificity for its native recognition sequence in 
the bacteriophage genome, thus preventing the repro- 
gramming of ZFR activity solely through exchanging the 
zinc-finger DNA-binding domain (40,42). To facilitate the 
use of ZFRs at diverse target sequences, we have used 
directed evolution to alter the sequence specificity of 
ZFR catalytic domains (42,45). In particular, we have re- 
cently described a structure-guided approach to directed 
evolution that facilitates the engineering of domains with 
high levels of activity exclusively on new targets (56). We 
anticipate that this work will lead to the development of 
ZFRs that integrate plasmids efficiently and specifically at 
any genomic site of interest. This work has the potential to 
transform customized genome engineering into a facile 
and routine procedure. 